Initial Results in the Development of SCAN: A Swedish Clinical Abbreviation Normalizer
Authors
Abstract
Abbreviations are common in clinical documentation, as this type of text is written under time pressure and serves mostly for internal communication. This study attempts to apply and extend existing rule-based algorithms developed for English and Swedish abbreviation detection, in order to create an abbreviation detection algorithm for Swedish clinical texts that can identify abbreviations and acronyms and suggest definitions for them. Such an algorithm can be used as a pre-processing step for further information extraction and text mining models, as well as for readability solutions. Through a literature review, a number of heuristics were defined for automatic abbreviation detection and used in the construction of the Swedish Clinical Abbreviation Normalizer (SCAN). The algorithm draws on: a) freely available external resources (a dictionary of general Swedish, a dictionary of medical terms and a dictionary of known Swedish medical abbreviations), b) maximum word lengths (from three to eight characters), and c) heuristics for handling common patterns such as hyphenation. For each token in the text, the algorithm checks whether it is a known word in one of the lexicons, and whether it fulfills the criteria for word length and the created heuristics. The final algorithm was evaluated on a set of 300 Swedish clinical notes from an emergency department at the Karolinska University Hospital, Stockholm. These notes were annotated for abbreviations (a total of 2,050 tokens) by a physician accustomed to reading and writing medical records. The algorithm was tested in different variants, in which the word lists were modified, the heuristics were adapted to characteristics found in the texts, and different combinations of word lengths were used.
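The per-token check described above can be illustrated with a minimal Python sketch. It is not the SCAN implementation: the function name, the order in which the checks are combined, the hyphen handling and the toy word lists are all assumptions made for illustration, since the abstract does not specify these details.

```python
# Toy lexicons standing in for the three external resources (illustrative only).
GENERAL_WORDS = {"patient", "med", "och"}
MEDICAL_TERMS = {"buksmärta"}
KNOWN_ABBREVIATIONS = {"pat", "ua"}


def is_abbreviation_candidate(token, max_length=8):
    """Return True if the token looks like an abbreviation (hypothetical rule order)."""
    word = token.lower().rstrip(".")
    # Ignore empty tokens and tokens without letters (numbers, punctuation).
    if not word or not any(ch.isalpha() for ch in word):
        return False
    # A token listed in the abbreviation dictionary is accepted directly.
    if word in KNOWN_ABBREVIATIONS:
        return True
    # Word-length criterion: longer tokens are assumed to be full words.
    if len(word) > max_length:
        return False
    # Assumed hyphenation heuristic: check each hyphen-separated part.
    parts = [part for part in word.split("-") if part]
    all_known = all(p in GENERAL_WORDS or p in MEDICAL_TERMS for p in parts)
    # Short tokens that are not known words are flagged as candidates.
    return not all_known
```

With these toy lists, `is_abbreviation_candidate("pat.")` and `is_abbreviation_candidate("ssk")` return `True`, while `is_abbreviation_candidate("patient")` returns `False`. Lowering `max_length` flags fewer tokens and raising it flags more, which is the precision and recall trade-off the word-length parameter controls.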
The best-performing version of the algorithm achieved an F-measure of 79%, with 76% recall and 81% precision, a considerable improvement over the baseline in which each token was only matched against the word lists (51% F-measure, 87% recall, 36% precision). Not surprisingly, precision is higher when the maximum word length is set to the lowest value (three), and recall is higher when it is set to the highest (eight). Algorithms for rule-based systems, mainly developed for English, can be successfully adapted for abbreviation detection in Swedish medical records. System performance relies heavily on the quality of the external resources, as well as on the created heuristics. In order to improve results, part-of-speech information and/or local context is needed for disambiguation. In the case of Swedish, compounding also needs to be handled.
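As a quick check on how the reported scores relate to each other, the following sketch computes the balanced F-measure (the harmonic mean of precision and recall). It assumes the abstract reports this standard balanced F1 variant, which is not stated explicitly; the function name is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures taken from the abstract.
best = f_measure(precision=0.81, recall=0.76)
baseline = f_measure(precision=0.36, recall=0.87)
```

Recomputed from the rounded percentages, `best` is about 0.784 and `baseline` about 0.509. These are close to the reported 79% and 51%; a small gap is to be expected when the inputs are themselves rounded.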
Similar resources
Swedish Massage and Abnormal Reflexes of Children with Spastic Cerebral Palsy
Objectives: Massage therapy is one of the most widely used complementary and alternative medicine therapies for children. This study was conducted to determine the effect of Swedish massage on abnormal reflexes in children with spastic cerebral palsy (CP). Methods: This study was a single-blind clinical trial conducted on forty children with spastic CP who were recruited from clinics of the U...
The Effect of Swedish Massage on Glycohemoglobin in Children with Diabetes Mellitus
Objectives: Diabetes mellitus (DM) is the most common endocrine disease in children. Massage therapy can improve glucose metabolism in DM. This study was conducted to determine the effect of Swedish massage on glycohemoglobin (HbA1c) in children with DM. Methods: This study was a semi-experimental study (clinical trial) conducted on thirty-six children, 6-12 years old, with DM, recruited fro...
Evaluation of CT angiography findings in the patients with clinical diagnosis of pulmonary thromboembolism
Background and Aim: Pulmonary thromboembolism is an important clinical problem in patients after major surgery and is often difficult to diagnose because of nonspecific clinical symptoms. Diagnosis of pulmonary embolism is based on medical imaging methods. The aim of this study was to evaluate the results of CT pulmonary angiographies of patients with a primary clinical diagnosis of pul...
Development of Latrodectus Envenomation Severity Score (LESS); a Severity Index for Widow Spider Bite: Initial Step
Background: In order to describe the patients and evaluate the effectiveness of treatments for widow spider envenomation, investigators require a reliable assessment tool. In this paper, the development of a clinical index for measuring the widow spider bite severity, Latrodectus Envenomation Severity Score (LESS), is described. Methods: According to the valid methods for index development, a D...
Clinical Value of Low Dose CT-Scan in Pediatric Chest Diseases: Adequacy Assessment
Background: The radiation dose of standard thoracic computed tomography (CT) is about 400 times that of a chest X-ray. As a result, different approaches to decreasing the radiation dose, such as low-dose protocols, have been established in the last few years to prevent possible side effects, especially in children. The aim of this study was assessment of clinical v...